SemanticScuttle - klotz.me » Tags: mistral+llm+machine learning

The Big LLM Architecture Comparison

A detailed comparison of the architectures of recent large language models (LLMs), including DeepSeek-V3, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen3, SmolLM3, and Kimi K2, focusing on key design choices and their impact on performance and efficiency.

2025-07-19 Tags: llm, large language models, deep learning, ai, architecture, deepseek, olmo, gemma, mistral, llama, qwen, smollm, kimi, moe, attention, transformers by klotz

mistral-finetune - GitHub

A lightweight codebase for memory-efficient, performant fine-tuning of Mistral's models. It is based on LoRA, a training paradigm in which most weights are frozen and only an additional 1-2% of weights, in the form of low-rank matrix perturbations, are trained.

2024-06-06 Tags: github, mistral, lora, python, machine learning, fine tuning, llm by klotz
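
The low-rank idea described above can be illustrated with a minimal pure-Python sketch. It is not code from mistral-finetune: the function names, the `alpha / rank` scaling, and the example dimensions are assumptions chosen for illustration, following the common LoRA formulation y = Wx + (alpha / rank) * B(Ax), where W stays frozen and only the small matrices A and B are trained.

```python
def lora_param_fraction(d_in, d_out, rank):
    """Trainable LoRA weights as a fraction of one frozen weight matrix."""
    frozen = d_in * d_out               # size of the frozen matrix W
    trainable = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    return trainable / frozen


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); W is frozen, A and B are trained."""
    base = matvec(W, x)                 # output of the frozen layer
    update = matvec(B, matvec(A, x))    # low-rank perturbation
    scale = alpha / rank                # assumed scaling convention
    return [b + scale * u for b, u in zip(base, update)]


# Hypothetical example: a 4096 x 4096 projection with rank 32.
fraction = lora_param_fraction(4096, 4096, 32)

# Tiny forward pass: with B initialised to zero, the output equals W x.
W = [[1, 0], [0, 1]]
A = [[1, 1]]
B_zero = [[0], [0]]
y = lora_forward(W, A, B_zero, [2, 3], alpha=2, rank=1)
```

For the assumed 4096 x 4096 layer at rank 32, the adapter adds 1/64 (about 1.6%) of the frozen matrix's weight count, which falls in the 1-2% range quoted in the description. Initialising B to zero means training starts from the unmodified base model.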
